(57) ABSTRACT

A system, method, and software product dynamically determine
network applications associated with any ports being used by packets
on a network, allowing the packets to be properly routed, counted,
and reported according to their applications. In one embodiment, an
application-port mapping table stores static associations or mappings
between applications and ports, as defined by a standards body or
other source. The application-port mapping table is dynamically
updated during runtime to reflect dynamic associations between
applications and ports as extracted from packet data. The
associations are identified by a packet analysis module which
performs a two step verification of an application for a packet. In
a first step, the packet analysis module applies the ports from a
packet to the application-port mapping table to obtain a first
application identifier. In a second, separate step, the packet
analysis module applies identification logic to the packet to
identify an application based on packet data. The second step may be
used for each packet or only where the packet is not identified by
the application-port mapping table. If a second application is
successfully identified, the packet analysis module updates the
application-port mapping table by adding a new association between
the second identified application and a port of the packet. To keep
the application-port mapping table current, the table is periodically
scanned to remove associations which have expired; alternatively, an
association is removed when an end of sequence packet for its
application is detected.

11 Claims, 6 Drawing Sheets

AUTOMATIC IDENTIFICATION OF APPLICATION PROTOCOLS THROUGH
DYNAMIC MAPPING OF APPLICATION-PORT ASSOCIATIONS

BACKGROUND

1. Field of Invention

The present invention relates generally to the field of computer
network management, and more particularly, to systems, methods, and
software products for identifying network applications for processing
packet data.

2. Background of the Invention

Conventional network applications in computer networks running over
common layer 3 (network layer) protocols use static port mappings. A
port mapping defines the specific port or socket number that network
traffic for a specific network application is routed to, or sent
from. These port mappings are available from the Internet Assigned
Numbers Authority (IANA) in order to differentiate between
applications. These application-port mappings are published and are
well defined, and typically used to deliver network traffic to the
appropriate applications.

When ports are defined statically in relationship to applications, it
is possible for a network monitor, or similar product, to reliably
identify traffic by looking at a single packet and checking its port
number against the IANA (or similar) list. These static mappings are
thus relied upon by network management software products to determine
the protocols and applications being used for network traffic, and
therefore compile information about traffic patterns, demand,
latency, and other performance characteristics.

However, a number of network protocols allow for the use of dynamic
application-port mappings. These protocols allow a packet to be sent
from or directed to any available port, not only the statically
defined ports. As a result, reliance on the static application-port
mappings cannot guarantee accurate decoding and use of packets, or
accurate analysis of network traffic.

For example, FIG. 1 illustrates the TCP header for a packet carrying
HTTP traffic. The HTTP protocol (defined in RFC 1945) generally uses
the port 80 when running over TCP (defined in RFC 793). Traffic for
the HTTP protocol is thus identified by examining the source or
destination ports in the TCP header for the value "80". This value
identifies the packet as being a TCP packet for the HTTP protocol.

However, the URL specification for HTTP (defined in RFC 1738) allows
a dynamic port to be added to the end of a URL request. For example,
a URL for HTTP may have the form "http://www.company.com:8080", where
the value "8080" identifies the port number to be used. FIG. 2
illustrates the TCP header in this case. With a port of 8080 it is
not possible using conventional network management tools to identify
this as HTTP traffic, since the standard port is not being used.

This problem has existed at least since the introduction of the HTTP
protocol in 1991. This problem also occurs with other protocols, such
as FTP and NNTP, and for any of the many IP protocols. Indeed,
dynamic ports are frequently used, for example to provide for
security or for improved resource sharing. Accordingly, there is a
need to be able to handle dynamic mappings for network traffic.

Several approaches have been attempted to solve this problem. Some
existing solutions require that network management software be
manually configured to add the mapping of the new port (e.g. 8080 or
whatever other port number is used) to the application protocol HTTP
(or other protocol). This requires the user of network management
software to know in advance the ports being used, in order to
configure the mapping. This is not a satisfactory solution since it
is generally not possible to know the mappings ahead of time. In
addition, the mappings can change at any time. Thus, preconfiguring
the network management software with a fixed set of mappings means
that the software does not have great flexibility to deal with new
mappings. In addition, [illegible] by preconfigured mappings.

Other schemes require instrumentation of the application using the
protocol to determine the port mappings it is using, or the
interception of common protocol interfaces, such as in Microsoft
Corp.'s WinSock, to attempt to determine the mapping required.
Neither of these methods provides a non-intrusive or
configuration-free method of identifying application traffic from a
raw stream of data on a computer network.

Accordingly, it is desirable to provide a system, method and software
product that can dynamically map application and port relationships
and thereby correctly identify applications, protocols and other
network data from packet data.

SUMMARY OF THE INVENTION

The present invention overcomes the limitations of conventional
systems and methods by dynamically identifying the relationships
between applications and ports in received packet data, and by
storing and updating these relationships. In this manner, the present
invention is able to identify the correct applications for packets
that use either standard, default port numbers, or non-standard,
dynamically defined port numbers.

In one embodiment, the present invention provides a
computer-implemented method for dynamically determining a network
application for a stream of network packets. A plurality of
associations between network applications and ports are stored, for
example, in a high speed cache. A packet is received from a network
packet source. If there is a stored association between a network
application and a port of the packet (as contained in the packet
header), then the packet is provided to the network application so
identified.

If there is no stored association between a network application and a
port of the packet, the packet data, for example the payload and/or
header, is analyzed to identify a network application for handling
the packet, and a new association is stored between this identified
network application and a port of the packet. This analysis may be
made even if there was a previously stored association between a port
of the packet and a network application.

Additionally, stored associations may have expiration times, based on
a timestamp of the last packet that matched the port in a stored
association, and a timeout value. Periodically, the stored
associations are processed to remove associations that have expired,
according to their expiration times.

In one embodiment, the present invention operates in software
products such as network monitors or protocol decoders. Such products
are configured as follows. An application-port mapping table is used
to store the associations between network applications and ports.
Each network application is uniquely associated with, and identified
by, one application identifier. A special identifier is used to
represent an "unknown" application. The application-port mapping
table is initialized with a set of static application-port mappings,
such as defined by a standards authority. The application-port
mapping table will be subsequently updated with new application-port
associations as these are extracted from actual packet data that is
received and processed. The application-port mapping table is
preferably stored in high speed memory during runtime.

During runtime, packet data is received by a packet analysis module.
The packet analysis module extracts from the packet its destination
port and source port values. The packet analysis module applies these
port values to the application-port mapping table. If the
application-port mapping table contains an association with a port
matching one of the ports for the packet, then the packet is
positively identified, and the application-port mapping table returns
the application identifier associated with the matching port. This
first application identifier initially identifies the application for
handling the packet.

In a conventional system, as explained above, if a packet includes a
port value which is not a static value for the appropriate
application, i.e. a dynamic port, the packet will not be properly
identified as being appropriate for the application (the packet may
still be received by the application since other mechanisms may be
preserving this state information).

In the present invention, the application-port mapping table will
initially not contain an association between the dynamic port and an
application identifier, and thus the application for the packet will
be unknown. However, unlike conventional approaches, the packet
analysis module then analyzes the packet to determine the appropriate
application for handling the packet. The analysis is preferably of
the payload, but may also include the packet header. The analysis is
generally based on predefined data patterns that are uniquely
associated with each application. This analysis may include, for
example, determining whether the payload includes a valid URL to an
HTTP address, thereby indicating an HTTP application. Other payload
specific data or patterns may also be recognized.

Once the packet analysis module successfully identifies the packet as
belonging to a specific application, it creates a new entry in the
application-port mapping table, associating a port number from the
packet with the application identifier appropriate to the identified
application. The packet analysis module also checks to make sure that
no duplicate mappings for the port exist in the application-port
mapping table. The next time a packet is received with the newly
mapped port number, the application-port mapping table will correctly
identify it. In this manner, the packet analysis module dynamically
updates the application-port mapping table to include associations
between application identifiers and ports extracted from packet data.

In some instances, even if the application-port mapping table does
contain an association between an application identifier and a port,
this association may no longer be valid. This is particularly true
for associations that were dynamically created, as described above,
since these tend to be specific to individual transactions. In one
embodiment then, after the application-port mapping table has
provided the application identifier, the packet analysis module
separately analyzes the packet, as above. If this analysis is
successful and an application is identified, this application will be
given the packet, regardless of the application identified by the
application-port mapping table.

In one embodiment, outdated associations defined in the
application-port mapping table are periodically removed through a
garbage collection process, so that the current dynamic associations
will always be present in the table. In one embodiment, the newly
created entries in the application-port mapping table have an
expiration time. The packet analysis module periodically checks the
application-port mapping table, and removes associations that have
expired. In one embodiment, the expiration time of an
application-port association is based on the timestamp of the last
packet that matched the port of the association, plus a local timeout
value (a fixed number of seconds). The expiration time may be
compared to either a current time, or the timestamp of the last
received packet to determine whether it has expired. In this manner,
the table is maintained with the most current associations between
applications and ports, as determined by the packet data itself.

The use of expiration times further allows for another optimization
of the packet analysis module's performance. Since the expiration
time of each application-port association ensures that the
application-port mapping table is always current, in one embodiment,
the packet analysis module processes the packet a second time only
when the application-port mapping table is unable to identify the
packet. In this manner, the packet analysis module updates the
application-port mapping table only as needed, which is when the
packet cannot be matched to an application-port mapping that has been
previously detected.

The present invention may be implemented in a number of different
manners. One implementation uses an object-oriented collection of
objects to provide the described functionality and features. The
application-port mapping table is implemented as a collection of
application map objects, each storing an application identifier and
port association, and an optional expiration time where garbage
collection is used. The packet analysis module is implemented as an
application source object and a collection of application identifier
objects. Each of the application identifier objects implements a
method that encodes the logic used to identify a specific application
from the packet data.

The application map objects are initialized with a set of standard,
static application identifier-port associations. The application
source object receives a packet from a packet source object, and
traverses the collection of application map objects to obtain a first
application identifier from the application map object having a
matching port. If none of the application map objects match the
packet, then the application is deemed "unknown."

The application source object also traverses the application
identifier objects, having each one process the packet to separately
identify the application. Each application identifier object returns
either an application identifier of the specific application it
recognizes or an "unknown" if the application identifier object does
not recognize the packet.

If the packet is recognized by one of the application identifier
objects, then the application source object instantiates a new
application map object with the application identifier returned by
the application identifier object which recognized the packet, and a
port number from the packet. The application source object then
passes the packet to the second, identified application. Summary data
may now be accurately accumulated for all packets, to accurately
monitor network performance.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is the header of an HTTP packet over TCP/IP using a standard
port mapping.

FIG. 2 is the header of an HTTP packet over TCP/IP using a
non-standard port mapping.

FIG. 3 is a functional model of a system in accordance with the
present invention.

FIG. 4 is an illustration of a hardware architecture for implementing
the present invention.

FIG. 5 is an object model of a software product in accordance with
the present invention.

FIG. 6 is a flowchart of a method for managing the identification of
an application for a packet.

FIG. 7 is a flowchart of an optimized method for managing the
identification of an application for a packet.

DETAILED DESCRIPTION OF THE
INVENTION

Referring now to FIG. 3 there is shown a functional model of the
present invention for identifying applications and network protocols
from raw packet data. In this embodiment, the present invention
includes a packet analysis module 100, an application identifier
module 102, and an application-port mapping table 104. These elements
may be included in various different products, such as network
monitors, protocol decoders, protocol analyzers, routers, brouters,
bridges, or the like.

The application-port mapping table 104 stores associations between
application identifiers and port numbers. Each application identifier
identifies a different application available on the computer (or
network of computers for distributed applications) capable of
receiving or processing the packet. Each application has a unique
identifier. A special 'unknown' application identifier (e.g.
"APP_UNKNOWN") is used to represent an unknown application, though a
mapping for the unknown application need not be stored in the table.
The application-port mapping table 104 is cached in memory for high
speed access.

Upon startup, the application-port mapping table 104 is initialized
213 with some number of static application-port associations 108.
These static associations (or 'mappings') are based on the published
specifications for the various applications and protocols, and their
standard port numbers. Table 1 includes a preferred list of network
applications, application identifiers, and their standard ports used
to define the static application-port associations 108.



TABLE 1

Application    Port    Application Identifier
HTTP           80      APP_HTTP
FTP            20      APP_FTP
FTP            21      APP_FTP
SMTP           25      APP_SMTP
NNTP           119     APP_NNTP
POP3           110     APP_POP3


The application-port mapping table 104 will be later 
updated 205 by the packet analysis module 100 with 
dynamic associations 112 between application identifiers
and ports based on the actual packet data that is received 
from the packet source 106. The packet source 106 is any 
useful source of packet (layer 3) data. 

The packet analysis module 100 receives 201 raw packet data from the
network packet source 106, typically through some network interface,
such as a network interface card. The packet analysis module 100
decomposes this packet, extracting various field data from the
packet. The extracted data includes the source port, the destination
port from the packet header, and the payload.

The packet analysis module 100 applies 203 the port numbers from the
packet to the application-port mapping table 104 to obtain an
application identifier. If there is an association that matches the
port data, then the application-port mapping table 104 returns 204
the application identifier for the network application. The
application-port mapping table 104 will be able to return an
application identifier if either the port number is a standard port
number for the application and included in the original set of static
application-port mappings, or is a new application identifier-port
mapping extracted from the received packets, and stored in the
application-port mapping table. If there is no matching port, then
the packet analysis module 100 assigns the result as being the
unknown application identifier.
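The lookup step described above can be sketched as follows. This is a minimal illustration, assuming the table is held as an in-memory dictionary keyed by port number and seeded from Table 1. The function names, and the choice to test the destination port before the source port, are assumptions made for the example and are not taken from the specification.

```python
from typing import Dict

APP_UNKNOWN = "APP_UNKNOWN"

# Static associations from Table 1 (port -> application identifier).
STATIC_ASSOCIATIONS: Dict[int, str] = {
    80: "APP_HTTP",
    20: "APP_FTP",
    21: "APP_FTP",
    25: "APP_SMTP",
    119: "APP_NNTP",
    110: "APP_POP3",
}


def new_mapping_table() -> Dict[int, str]:
    """Return a fresh table initialized with the static associations."""
    return dict(STATIC_ASSOCIATIONS)


def lookup_application(table: Dict[int, str],
                       source_port: int,
                       destination_port: int) -> str:
    """Return the identifier for the first matching port, else APP_UNKNOWN.

    Checking the destination port first is an assumption of this sketch.
    """
    for port in (destination_port, source_port):
        app_id = table.get(port)
        if app_id is not None:
            return app_id
    return APP_UNKNOWN
```

A packet sent to port 80 resolves to APP_HTTP, while a packet whose ports are both absent from the table resolves to APP_UNKNOWN until a dynamic association is added.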

In one embodiment, even if the application-port mapping 
table 104 returns 204 an application identifier for a specific
application, the present invention provides an additional 
sequence of processing steps to verify this application, 
which in some instances may be incorrect. This is done as 
follows. The packet analysis module 100 passes 207 the 
packet to the application identifier module 102. This module 
102 analyzes the packet to determine 209 an application 
appropriate for handling the packet. 

The application identifier module 102 attempts to match 
portions of the packet data, including the payload or header, 
with defined patterns or attributes for each of a plurality of 
different applications. Generally, for each application there 
are one or more bit patterns which uniquely identify the 
application. These patterns may be in the payload data, in the 
header, or in combinations of both. The bit patterns are 
defined as sequences of <offset, bit pattern> tuples. Each
application may have one or more such sequences which
uniquely identify it. Table 2 includes preferred bit patterns
used to identify selected network applications, along with
the port used to map the application into the application-port
mapping table 104. Alternate patterns are shown numbered.



TABLE 2

Application  Data Patterns <bit offset, data pattern>       Port Used
                                                            in Mapping

HTTP         <0, "GET"><4, URL><n, "HTTP/1.">               Destination

NNTP         <0, "GROUP"><6, newsgroup><n, CRLF>            Destination
             <n + 2, EndofFrame>

SMTP         1) <0, "220_"><4, server_name>                 Source
                <n, "Simple Mail Transfer Service Ready">
             2) <0, "MAIL FROM: <"><11, email_addr>         Destination
                <n, CRLF>
             3) <0, "RCPT TO: <"><9, email_addr>            Destination
                <n, CRLF>

POP3         1) <0, "+OK POP3 server ready"><22, CRLF>      Source
             2) <0, "+OK"><4, number><n, "messages          Source
                ("><n+10, number><m, "octets)">

FTP          1) <0, "230 User logged in">                   Source
             2) <0, "227 Entering Passive Mode ("><26,      Source
                n1, n2, n3, n4, n5, n6><n, ")">
                [where ni<256]
             3) <0, "PORT"><5, n1, n2, n3, n4, n5, n6>      Destination
                [where ni<256]




Here URL, newsgroup, server_name, email_addr, and number are valid
forms as defined by their respective protocols. The identification of
these patterns may be done through various types of pattern matching
algorithms, or simple conditional testing.
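As one illustration of such pattern matching, the sketch below expresses a few of the Table 2 patterns as regular expressions over the payload bytes. It is a simplified approximation under stated assumptions: the expressions are anchored at the start of the payload rather than checked offset by offset, the URL and newsgroup fields are matched loosely as runs of non-whitespace, and the names PATTERNS and identify are hypothetical.

```python
import re
from typing import Optional, Tuple

APP_UNKNOWN = "APP_UNKNOWN"

# (application identifier, pattern, port of the packet used in the mapping)
# Loose approximations of selected Table 2 rows; not a complete rule set.
PATTERNS = [
    ("APP_HTTP", re.compile(rb"GET \S+ HTTP/1\."), "destination"),
    ("APP_NNTP", re.compile(rb"GROUP \S+\r\n"), "destination"),
    ("APP_SMTP", re.compile(rb"MAIL FROM: ?<[^>\r\n]*>"), "destination"),
    ("APP_POP3", re.compile(rb"\+OK POP3 server ready"), "source"),
    ("APP_FTP", re.compile(rb"230 User logged in"), "source"),
]


def identify(payload: bytes) -> Tuple[str, Optional[str]]:
    """Return (application identifier, port side), or (APP_UNKNOWN, None)."""
    for app_id, pattern, port_side in PATTERNS:
        # re.Pattern.match only matches at the beginning of the payload,
        # which corresponds to offset 0 in Table 2.
        if pattern.match(payload):
            return app_id, port_side
    return APP_UNKNOWN, None
```

The second element of the result indicates which port of the packet (source or destination) would be recorded in the application-port mapping table, following the last column of Table 2.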

In one embodiment, the application identifier module 102 
processes the packet data until it positively identifies one 
application, and returns 209 the application identifier for this 
application. 

In another embodiment, the application identifier module 102
identifies all applications that may be appropriate and applies a
selection logic to select one of these as the most appropriate
application, returning 209 the application identifier for this
application. One implementation of the selection logic is to assign a
confidence factor to each pattern or logic used to identify an
application. When multiple applications are identified, the
application identifier module 102 selects the application having the
highest confidence factor.
If the application identifier module 102 is unable to match 
the packet data with any applications, then it returns 209 the 
unknown application identifier to the packet analysis module 
100. Alternatively, the application identifier module 102 
returns a null result to the packet analysis module 100, 
which assigns an unknown application identifier to the 
outcome. 

If the application identifier returned by the application identifier
module 102 identifies an application (i.e. the application is not
unknown) then the packet analysis module 100 creates 205 a new entry
in the application-port mapping table 104 using the returned
application identifier from the application identifier module 102 and
a port number from the packet. The port used is determined by the
specific method used to identify the application, as shown in the
examples of Table 2. The packet analysis module 100 checks to make
sure that duplicate entries for a port are not present, replacing any
previous association with the new association. Where garbage
collection is used, the packet analysis module 100 also sets an
expiration time for this new association. Preferably the expiration
time is equal to the timestamp of the packet plus a local timeout
value. The new association will enable the application-port mapping
table 104 to return 203 the correct application identifier for
packets which use the same port number, which will be other than the
standard default ports for that particular application. In this
manner, the application-port mapping table 104 is continually updated
to reflect the dynamic application-port associations present in the
packet data, and not merely the static associations set by the
standards authorities, such as the IANA. When a subsequent packet,
typically in the same transaction, is received, the packet analysis
module 100 will be able to obtain the correct application identifier
from the application-port mapping table 104 immediately, and then
verify it in the application identifier module 102.
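The table update can be sketched as below, assuming the table is a dictionary keyed by port so that storing a new association replaces any previous one for that port. The Association record, the function name, and the 30 second default for the local timeout are hypothetical; the specification does not give a value for the local timeout.

```python
from dataclasses import dataclass
from typing import Dict, Optional

LOCAL_TIMEOUT = 30.0  # seconds; illustrative value only


@dataclass
class Association:
    app_id: str
    # Expiration time in seconds; None marks a static entry that never expires.
    expires_at: Optional[float]


def add_dynamic_association(table: Dict[int, Association],
                            port: int,
                            app_id: str,
                            packet_timestamp: float,
                            local_timeout: float = LOCAL_TIMEOUT) -> None:
    """Store a dynamic association for the port.

    Assigning by key overwrites any earlier association for the same port,
    so the table never holds duplicate entries for one port.
    """
    table[port] = Association(app_id, packet_timestamp + local_timeout)
```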

After the packet is identified, it is provided 211 to the 
specified application for processing. This processing may
include any type of analysis or use of the packet. 

If the application identifier module 102 was not able to 
identify the packet, then the application identified 203 by the 
application-port mapping table 104 is used to invoke 211 an 
application for processing the packet. 

If neither the application-port mapping table 104 nor the
application identifier module 102 is able to identify a packet, 
then the packet is discarded, or alternatively, passed to a 
default application for handling unknown packets. 
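The three outcomes described in the preceding paragraphs can be summarized in a short decision function. This is a sketch of the precedence only: the identifier from the application identifier module takes priority, the table result is the fallback, and a None return stands for "discard or pass to a default application". The function name and the use of None are assumptions of the example.

```python
from typing import Optional

APP_UNKNOWN = "APP_UNKNOWN"


def choose_application(table_app: str, identified_app: str) -> Optional[str]:
    """Decide which application receives the packet.

    table_app is the result of the port lookup; identified_app is the
    result of analyzing the packet data.
    """
    if identified_app != APP_UNKNOWN:
        return identified_app  # analysis of the packet data wins
    if table_app != APP_UNKNOWN:
        return table_app       # fall back to the table association
    return None                # caller discards or uses a default handler
```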

Where garbage collection is used, the packet analysis module 100
periodically scans the application-port mapping table 104 to check
the expiration time of each association. If an association has
expired, then the packet analysis module 100 removes it from the
application-port mapping table 104. In one embodiment, the packet
analysis module 100 stores a global timeout value, which defines a
fixed number of seconds. This global timeout value is preferably 10
to 20 times the local timeout value. Each time a packet is received,
the packet analysis module 100 compares the difference between the
timestamp of the current packet and the previous packet to the global
timeout value. If the difference in timestamps exceeds the global
timeout value, then the packet analysis module 100 removes the
expired associations from the application-port mapping table 104, as
described. This may be done as the packet analysis module 100 queries
the application-port mapping table 104 for a matching port.
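The garbage collection step can be sketched as follows, assuming the dynamic associations are tracked in a dictionary from port to expiration time (static entries are simply not listed there). The function names are hypothetical, and treating an entry as expired when its expiration time is earlier than the current packet's timestamp is an assumption of this example.

```python
from typing import Dict, List, Optional


def remove_expired(expirations: Dict[int, float], now: float) -> List[int]:
    """Delete every dynamic association whose expiration time has passed."""
    expired = [port for port, expires_at in expirations.items()
               if expires_at < now]
    for port in expired:
        del expirations[port]
    return expired


def maybe_collect(expirations: Dict[int, float],
                  previous_timestamp: Optional[float],
                  current_timestamp: float,
                  global_timeout: float) -> List[int]:
    """Run the sweep only when the gap between consecutive packets
    exceeds the global timeout value, as described above."""
    if (previous_timestamp is not None
            and current_timestamp - previous_timestamp > global_timeout):
        return remove_expired(expirations, current_timestamp)
    return []
```

With a local timeout of 30 seconds, a global timeout in the preferred range would be 300 to 600 seconds.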

The expiration times of the application-port mapping table 104 and
the garbage collection process of the packet analysis module 100
enable a further performance optimization of the packet analysis
module 100. Since the garbage collection ensures that the
application-port mapping table 104 always maintains the current
application-port associations, the packet analysis module 100 can
decide to invoke the application identifier module 102 only when the
application-port mapping table 104 is unable to identify an
application from the ports of the packet. In this way, the
application identifier module 102 is used only as needed to create a
new association in the application-port mapping table 104. This
implementation provides for higher throughput by the packet analysis
module 100.
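A sketch of this optimized flow is given below. It is simplified to a single port per packet and takes the identification logic as a callable, so the number of invocations can be observed. The names classify and identify are assumptions of the example.

```python
from typing import Callable, Dict

APP_UNKNOWN = "APP_UNKNOWN"


def classify(table: Dict[int, str],
             port: int,
             payload: bytes,
             identify: Callable[[bytes], str]) -> str:
    """Consult the table first; analyze the payload only on a miss."""
    app_id = table.get(port, APP_UNKNOWN)
    if app_id == APP_UNKNOWN:
        app_id = identify(payload)   # invoked only when the table has no entry
        if app_id != APP_UNKNOWN:
            table[port] = app_id     # new dynamic association
    return app_id
```

Once the first packet on a non-standard port has been identified, later packets on that port are resolved from the table alone, which is where the throughput gain comes from.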

The present invention may be implemented in different 
types of software products, using various software 
architectures, which execute on standard computer hardware 
platforms. 

Referring now to FIG. 4, there is shown one embodiment
of a system architecture supporting a software product 
embodying the present invention. A preferred system 
includes a general purpose computer 300, having a conven- 
tional processor 301, addressable memory 303, and input 
and output devices (not shown). The addressable memory 
303 includes a conventional operating system 302, such as 
UNIX®, or Microsoft Windows®, or the like. The addres- 
sable memory 303 further includes a software product 304 
structured in accordance with the present invention. The 
software product 304 may be a network monitor, protocol 
decoder, protocol analyzer, or the like. A network interface, 
such as a network interface card 306, couples the computer 
to a conventional network 308. 

Referring now to FIG. 5, there is shown an object model 
for one implementation of the software architecture of the 
software product 304. The main entities are as follows: 

EApplication: EApplication is an enumerated type of
application identifiers. These identifiers are defined
uniquely, one per application. The identifiers listed above
with respect to Table 1 may be used. The actual values 
themselves are not significant, only the uniqueness of each 
identifier. A special application identifier, here called 
"APP_UNKNOWN" is used to represent an application that 
could not be identified by the system. 

CPacketSource: The CPacketSource is used as a generic
source of packet data for the system. This object may be 
instantiated as a way to read trace files or to receive data 
from a network interface card. The CPacketSource provides 
three methods to access data: 

Start: This method initializes the system to begin receipt 
of packet data. 

Receive: This method returns a packet, in the form of a 
CPacket object. 

Stop: This method shuts down the receipt of packets. 

CPacket: The CPacket represents a packet of information 
broken out into useful information about the packet. A 
CPacket object is returned by the CPacketSource from the 
Receive method. The CPacket has the following public
member variables:

Protocol: Stores the protocol type in the packet, for
example, TCP/IP, IPX, SPX.

SourcePort: Stores the source port or socket number.

DestinationPort: Stores the destination port or socket
number.

Payload: Stores the payload carried by the packet after the
protocol header has been removed by CPacketSource.

CApplicationMap: The CApplicationMap is an object that
represents an association between an application identifier
in an EApplication and a port or socket number. A collection
of CApplicationMap objects is instantiated for the statically
defined ports and sockets, as defined by a standards
authority or other source. The CApplicationSource object,
described below, maintains the collection, and adds new
CApplicationMap instances when application to port
associations are dynamically discovered from the packet data.
The collection of CApplicationMap objects represents the
application-port mapping table 104.

Each CApplicationMap contains the following member
variables:

Application: A reference to the EApplication with the
appropriate application identifier.

Port: The port number for the association.

Time: An expiration time used to remove dynamic entries
from the collection when they are no longer deemed
[illegible].

CApplicationIdentifier: The CApplicationIdentifier objects
perform the analysis of the payload or header data of a
CPacket to identify the application for a packet. There is
one CApplicationIdentifier object for each different type of
application. Each CApplicationIdentifier object has a single
method, Identify, which is passed a CPacket and analyzes its
payload and/or header to return a CApplicationMap for an
application which is appropriate for the packet. The method
of identification is particular for each application to be
identified, depending on various types of characteristics in
the payload or header data. Table 2, above, lists the various
characteristics useful for identifying different applications.
These characteristics may be encoded in the logic of each
object's Identify method as desired.

An example of an Identify method for identifying an
HTTP packet is:

    if (Payload == "GET <valid-URL> HTTP/1.")
        Application = APP_HTTP
        Port = DestinationPort
    else
        Application = APP_UNKNOWN

The collection of CApplicationIdentifier objects represents
the application identifier module 102.

CApplicationPacket: This object derives from CPacket,
and contains an additional member function Application (of
type EApplication) which represents the application
identified from the packet. This function invokes the
specified application. A CApplicationPacket instance is
created by the CApplicationSource (described below) after the
application for the packet has been identified by a
CApplicationIdentifier object or a CApplicationMap.

CApplicationSource: This object controls the operation of
product 304, and implements the packet analysis module
100. It has the following major functions:

Constructor: The constructor method creates a collection
of CApplicationMap and CApplicationIdentifier objects. The
collection of CApplicationMap objects is initialized with the
static application-port mappings, for example as shown in
Table 1.

Start: This method creates a CPacketSource object and
invokes its Start method.

Stop: This method invokes the Stop method of the
CPacketSource object.

Destructor: This method removes the CApplicationMap and
CApplicationIdentifier collections.

Receive: This method performs the work of managing the
identification of a packet.

FIG. 6 illustrates one embodiment of the Receive method.
The CApplicationSource's Receive method invokes 601 the
Receive method of the CPacketSource object to obtain a
packet stored in a CPacket. When a packet is received, the
information known is the port numbers; the application is
not known.

Receive then traverses 603 the CApplicationMap collection,
passing the port numbers of the CPacket to each
CApplicationMap object, and obtaining an application
identifier, if there is a match. This search is to identify
the application for the packet from static application-port
associations or [illegible] dynamic associations. If there is
a match (605) between the port numbers of the packet and one
of the CApplicationMap objects, then the returned application
identifier is stored 607 as a first application identifier,
stored as App1. If none of the CApplicationMap objects
matches, then the only information known about the packet is
the transport protocol, and so the Receive method sets 609
the first application identifier to APP_UNKNOWN.

In one embodiment, even if the first application identifier
is not unknown, the Receive method performs a verification
check on the application using the CApplicationIdentifier
objects. If the first application identifier is
APP_UNKNOWN, then identification using the Identify method
becomes particularly significant, since otherwise the packet
will not be correctly identified. In fact, with many packets,
the application identifier will be APP_UNKNOWN because the
application itself is maintaining its state with respect to
the packet stream as a whole, and waiting for a terminating
packet. For a product 304 such as a network [illegible]
maintain such state, protocol identification of [illegible]
necessary.

Accordingly, the Receive method then traverses 611 the
CApplicationIdentifier objects to obtain a second application
identifier from at least one of these objects, based on the
payload or header data for the packet. The Receive method
calls 611 the Identify method of each (or several)
CApplicationIdentifier object, and passes in the CPacket
object. Each CApplicationIdentifier object will attempt to
identify the packet as being of its own type, and return a
second application identifier for the identified application.

In one embodiment, the Receive method maintains a local
variable App2, which is updated with the returned second
application identifier from the CApplicationIdentifier
objects. In one embodiment, when App2 takes on any value
other than APP_UNKNOWN, the Receive method stops, and uses
this value to identify the application for the packet.
Alternatively, the Receive method traverses all of the
CApplicationIdentifiers, and if multiple matches are
obtained, selects one of the application identifiers as the
second application identifier, using a selection logic, for
example, based on confidence factors for matching
application identifiers.

If the second application identifier is not APP_UNKNOWN
(613), the Receive method then creates 615 a new
CApplicationMap object to store the association between a
port number in the CPacket and the second application
identifier for the identified application. The port used,
whether destination or source, depends on the particular
method of identification for the application. Typically, the
destination port is used where an identification is made from
a client request, and the source port is used where an
identification is made from a server reply.
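
By way of illustration only, the choice between destination and source port might be sketched in Python as follows. The names Packet and identify_http, and the two regular expressions, are hypothetical and are not part of the described embodiment. In particular, the reply signature (an HTTP status line) is an assumption, since the description gives only the request form, and the loose \S+ match merely stands in for <valid-URL>.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

APP_UNKNOWN = "APP_UNKNOWN"
APP_HTTP = "APP_HTTP"


@dataclass
class Packet:
    """Hypothetical stand-in for CPacket: ports plus the header-stripped payload."""
    source_port: int
    destination_port: int
    payload: bytes


# Request line such as b"GET /index.html HTTP/1.0"; \S+ loosely stands in for <valid-URL>.
_REQUEST = re.compile(rb"^GET \S+ HTTP/1\.\d")
# Status line such as b"HTTP/1.0 200 OK" (assumed reply signature, not from the text).
_REPLY = re.compile(rb"^HTTP/1\.\d \d{3}")


def identify_http(packet: Packet) -> Tuple[str, Optional[int]]:
    """Return (application identifier, port to associate with it)."""
    if _REQUEST.match(packet.payload):
        # Client request: the server is listening on the destination port.
        return APP_HTTP, packet.destination_port
    if _REPLY.match(packet.payload):
        # Server reply: the server is sending from the source port.
        return APP_HTTP, packet.source_port
    return APP_UNKNOWN, None
```

In both cases the port returned is the server's port, which is the one worth remembering for later packets of the same application.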

Prior to instantiating the new CApplicationMap object, 
the Receive method removes any existing CApplicationMap 
object having the same port number, thereby preventing any 
duplicate ports from appearing in the collection of CAppli- 
cationMap objects. Alternatively, if the Receive method 
identifies a previously existing CApplicationMap object 
with the same application identifier, it updates its port with 
the current port from the packet. 
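
The two update policies just described can be sketched as follows, as a minimal Python illustration only. The ApplicationMap class and the function names add_association and move_association are hypothetical, and the collection is assumed to be a plain list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ApplicationMap:
    """Hypothetical stand-in for CApplicationMap (expiration time omitted)."""
    application: str
    port: int


def add_association(maps: List[ApplicationMap], application: str, port: int) -> None:
    """Remove any entry already bound to `port`, then append the new entry."""
    maps[:] = [m for m in maps if m.port != port]
    maps.append(ApplicationMap(application, port))


def move_association(maps: List[ApplicationMap], application: str, port: int) -> bool:
    """Alternative policy: repoint the first entry for `application` at `port`."""
    for m in maps:
        if m.application == application:
            m.port = port
            return True
    return False
```

The first policy keeps ports unique in the collection; the second keeps at most the existing number of entries per application, at the cost of forgetting the earlier port.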

If none of the CApplicationIdentifier objects is able to
identify the CPacket, then the second application identifier
will remain APP_UNKNOWN (613). In this case, if the first
application identifier is also APP_UNKNOWN (621), then
the packet is either discarded or passed 623 to a default
application for handling.

If the second application identifier is not APP_UNKNOWN,
then the application for the packet has been identified. The
Receive method instantiates 617 a new CApplicationPacket
object, storing therein the packet data from the CPacket
object, and the reference to the appropriate EApplication for
the application identifier. This application identifier will
be the first application identifier if the
CApplicationIdentifier objects were not able to identify the
packet from the payload data. Otherwise, it will be the
second application identifier. The application represented by
the EApplication value is passed the CApplicationPacket
object.
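
The two-step flow of the Receive method described above may be sketched as follows. This is a simplified Python illustration under stated assumptions, not the implementation of the embodiment: the names Packet, ApplicationMap, Identifier and receive are hypothetical, identifiers are modeled as plain functions returning an (application, port) pair, the first-match stopping rule is used, and a return value of None stands for "discard or pass to a default application".

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

APP_UNKNOWN = "APP_UNKNOWN"


@dataclass
class Packet:
    source_port: int
    destination_port: int
    payload: bytes


@dataclass
class ApplicationMap:
    application: str
    port: int


# An identifier returns (application, port to associate) or (APP_UNKNOWN, None).
Identifier = Callable[[Packet], Tuple[str, Optional[int]]]


def receive(packet: Packet, maps: List[ApplicationMap],
            identifiers: List[Identifier]) -> Optional[str]:
    """Return the application chosen for `packet`, or None if it stays unknown."""
    # Step 1: port lookup in the application-port map (first identifier).
    app1 = APP_UNKNOWN
    for m in maps:
        if m.port in (packet.source_port, packet.destination_port):
            app1 = m.application
            break
    # Step 2: payload-based identification; stop at the first match.
    app2, port = APP_UNKNOWN, None
    for identify in identifiers:
        app2, port = identify(packet)
        if app2 != APP_UNKNOWN:
            break
    if app2 != APP_UNKNOWN:
        # Record the dynamic association, replacing any entry on the same port.
        maps[:] = [m for m in maps if m.port != port]
        maps.append(ApplicationMap(app2, port))
        return app2
    # Fall back to the port-based result; None means discard or default handling.
    return app1 if app1 != APP_UNKNOWN else None
```

Once a request has been identified from its payload, later packets on the same port that carry no recognizable pattern are still attributed to the application through the newly added map entry.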

In an alternative embodiment, the CApplicationIdentifier
objects may be replaced by an array of "patterns", each of
which defines a mapping from the pattern to the
EApplication. For example, a pair of patterns for the HTTP
protocol may be:

<SourcePort, DestinationPort, Pattern, EApplication,
DynamicPort>

80, *, *, APP_HTTP, 80

*, 80, *, APP_HTTP, 80

*, *, "GET <valid-url> HTTP/1.", APP_HTTP, DestinationPort

The first two lines represent the static mapping of port 80
to the application APP_HTTP (where * means "anything") and
the last line represents the dynamic mapping when the payload
contains a line of the form GET <valid-url> HTTP/x.y,
assuming that <valid-url> is a special pattern representing
any URL as defined in the URL specification for HTTP
(RFC 1738). This implementation makes it easier for the
developer of the system to alter the mappings since they may
easily be stored in a text file that is loaded and processed
when the CApplicationSource constructor is called. To use
this implementation, the CApplicationSource sequentially
processes the lines and patterns to find the right match of
port and EApplication.
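
A sequential scan over such a pattern table might look like the following Python sketch. It is offered only as an illustration: the Rule tuple, the RULES list and the match_rules function are hypothetical names, None plays the role of "*", and a regular expression loosely approximates the <valid-url> pattern.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Rule(NamedTuple):
    source_port: Optional[int]        # None plays the role of "*"
    destination_port: Optional[int]   # None plays the role of "*"
    pattern: Optional[re.Pattern]     # None plays the role of "*"
    application: str
    dynamic_port: str                 # a literal port, "DestinationPort" or "SourcePort"


RULES: List[Rule] = [
    Rule(80, None, None, "APP_HTTP", "80"),
    Rule(None, 80, None, "APP_HTTP", "80"),
    Rule(None, None, re.compile(rb"^GET \S+ HTTP/\d\.\d"), "APP_HTTP", "DestinationPort"),
]


def match_rules(rules: List[Rule], source_port: int, destination_port: int,
                payload: bytes) -> Optional[Tuple[str, int]]:
    """Return (application, port to map) for the first rule that matches."""
    for r in rules:
        if r.source_port is not None and r.source_port != source_port:
            continue
        if r.destination_port is not None and r.destination_port != destination_port:
            continue
        if r.pattern is not None and not r.pattern.search(payload):
            continue
        # Resolve the DynamicPort column against the packet's own ports.
        if r.dynamic_port == "DestinationPort":
            port = destination_port
        elif r.dynamic_port == "SourcePort":
            port = source_port
        else:
            port = int(r.dynamic_port)
        return r.application, port
    return None
```

Because the rules are data rather than code, they could equally be parsed from a text file at construction time, which is the convenience the alternative embodiment points to.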

Referring now to FIG. 7, there is shown an optimized 
embodiment of the Receive method, employing garbage 
collection of the CApplicationMap objects and selective use 
of the CApplicationIdentifier objects. In this
implementation, the CApplicationSource maintains a global 
timeout value, typically on the order of 1 to 5 seconds, and 
maintains a timestamp difference variable which is the 
difference in timestamps between a currently received 
packet and a previously received packet. 

The Receive method invokes 701 the Receive method of
the CPacketSource object to obtain a packet. The Receive
method then compares 703 the timestamp difference for the
current packet with the global timeout value. If the
timestamp difference is greater than the global timeout, then
sufficient time has elapsed since the previous packet was 
received to warrant garbage collection of the expired CAp- 
plicationMap objects. Accordingly, the Receive method 
traverses 705 the CApplicationMaps to obtain an application 
identifier, and as each CApplicationMap object is called, the 
Receive method checks the expiration time of the object 
against the current time, or alternatively, against the times- 
tamp of the current packet. If the expiration time of the 
CApplicationMap object is less than the current time/ 
timestamp, then the association between the application and 
port in the CApplicationMap is expired, and the Receive 
method removes 707 that object. 

If the timestamp difference does not exceed the global 
timeout, then the Receive method traverses 711 the CAp- 
plicationMap objects without removing 707 expired objects. 
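
The timestamp-triggered garbage collection can be sketched as follows, as a Python illustration only. The names GLOBAL_TIMEOUT, ApplicationMap, expires_at and collect_if_idle are hypothetical, and the sketch separates expiry from the lookup traversal for clarity. It further assumes that static entries carry an infinite expiration time so that they are never removed, a detail the description does not spell out.

```python
from dataclasses import dataclass
from typing import List

GLOBAL_TIMEOUT = 2.0  # seconds; the description suggests roughly 1 to 5


@dataclass
class ApplicationMap:
    application: str
    port: int
    expires_at: float  # expiration time, on the same clock as packet timestamps


def collect_if_idle(maps: List[ApplicationMap], previous_ts: float,
                    current_ts: float) -> int:
    """Drop expired entries when the gap between packets exceeds the timeout.

    Returns the number of entries removed.
    """
    if current_ts - previous_ts <= GLOBAL_TIMEOUT:
        return 0  # packets are arriving quickly: skip the expiry checks
    before = len(maps)
    # An entry is expired when its expiration time is less than the timestamp.
    maps[:] = [m for m in maps if m.expires_at >= current_ts]
    return before - len(maps)
```

The effect is that expiry checks are paid for only after a lull in traffic, when there is time to spare.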

If a CApplicationMap object is able to identify the packet 
by matching 709 a port of the packet, then the first appli- 
cation identifier is set 713 to the returned value of the 
CApplicationMap object. In addition, the Receive method 
updates 715 the expiration time of that CApplicationMap 
object, preferably setting the expiration time equal to the 
timestamp of the packet plus the local timeout value (0.1 to 
1 second preferably). At this point the packet is identified, 
and since the CApplicationMap objects are always current 
due to the garbage collection, there is no need to invoke the 
CApplicationIdentifier objects to identify the packet. The
Receive method creates 727 a new CApplicationPacket 
object with the packet and application identifier information, 
and passes 729 the CApplicationPacket to the identified 
application. This optimization increases the throughput of 
the system, and updates the CApplicationMap objects to 
always reflect the dynamic mappings of the application to 
ports as reflected by the packet data. 

If the CApplicationMap objects were not able to identify 
the packet, then the Receive method sets 719 the first 
application identifier to unknown, and traverses 721 the 
CApplicationIdentifier objects to identify the packet, and
obtain a second application identifier, as described above. If 
the second application identifier is APP_UNKNOWN (723) 
then the packet is either discarded or passed 717 to a default 
application. 

If the CApplicationIdentifier objects were able to identify
the application for the packet, and thus the second applica- 
tion identifier is not APP_UNKNOWN, then the Receive 
method creates 725 a new CApplicationMap object, asso- 
ciating the second application identifier with the matching 
port of the packet. 

The Receive method then creates 727 a new
CApplicationPacket object with the packet and application
identifier information, and passes 729 the
CApplicationPacket to the identified application.
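
The fast path of this optimized embodiment might be sketched in Python as follows. All names (Packet, ApplicationMap, receive_fast, LOCAL_TIMEOUT) are hypothetical, garbage collection is left out, and static entries are not distinguished from dynamic ones. Giving a newly created entry the same expiration as a refreshed one is an assumption, since the description states only how matched entries are refreshed.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

APP_UNKNOWN = "APP_UNKNOWN"
LOCAL_TIMEOUT = 0.5  # seconds; the description suggests 0.1 to 1


@dataclass
class Packet:
    source_port: int
    destination_port: int
    payload: bytes
    timestamp: float


@dataclass
class ApplicationMap:
    application: str
    port: int
    expires_at: float


Identifier = Callable[[Packet], Tuple[str, Optional[int]]]


def receive_fast(packet: Packet, maps: List[ApplicationMap],
                 identifiers: List[Identifier]) -> Optional[str]:
    """Return the application for `packet`, or None for discard/default handling."""
    # Fast path: a port match is trusted and refreshes the entry's expiration.
    for m in maps:
        if m.port in (packet.source_port, packet.destination_port):
            m.expires_at = packet.timestamp + LOCAL_TIMEOUT
            return m.application
    # Slow path: consult the payload identifiers only for unmapped ports.
    for identify in identifiers:
        app, port = identify(packet)
        if app != APP_UNKNOWN:
            # Assumed: a new entry starts with the same expiration as a refreshed one.
            maps.append(ApplicationMap(app, port, packet.timestamp + LOCAL_TIMEOUT))
            return app
    return None
```

The payload identifiers run only for the first packet seen on an unmapped port, which is where the throughput gain of this embodiment comes from.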

The above described embodiment for the CApplicationMap
objects uses an expiration time to remove dynamic
associations from the CApplicationMap collection when they
have not been accessed for more than some fixed number of
seconds. An alternative method of garbage collection is to
detect the appearance of each protocol's own "end of
sequence" identification as a signal to remove an association
from the application-port mapping table 104, via the
applicable CApplicationMap object. For example, in TCP, the
FIN flag in the header may be scanned for, and upon its
detection, the appropriate entry removed. This
implementation is slightly slower, but also slightly more
memory efficient.
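
For TCP, the end-of-sequence variant may be sketched as follows. This Python fragment is illustrative only: remove_on_fin is a hypothetical name, the mapping table is assumed to be a dictionary keyed by port, and static ports are assumed to be protected from removal by an explicit set. The header layout used (ports in the first four bytes, flags in byte 13, FIN as the lowest flag bit) is that of the standard TCP header.

```python
import struct
from typing import Dict, FrozenSet

TCP_FIN = 0x01  # FIN is the lowest bit of the TCP flags byte


def remove_on_fin(tcp_header: bytes, port_map: Dict[int, str],
                  static_ports: FrozenSet[int] = frozenset()) -> None:
    """Drop dynamic associations for the ports of a segment carrying FIN."""
    # Source and destination ports are the first two 16-bit big-endian fields.
    source_port, destination_port = struct.unpack("!HH", tcp_header[:4])
    flags = tcp_header[13]  # byte 13 holds the control flags
    if flags & TCP_FIN:
        for port in (source_port, destination_port):
            if port not in static_ports:
                port_map.pop(port, None)
```

Every header must be inspected for the flag, which is consistent with the remark that this approach trades some speed for memory.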

In summary then, there is described a system, method, and
software product for dynamically mapping the associations
between applications and ports as extracted from the packet
data being analyzed. The present invention thereby provides
for improved accuracy in the detection and accounting of
traffic data, and in the ability to accurately report and
manage such traffic.
I claim: 

1. A computer implemented method for dynamically 
determining a network application for a stream of network 
packets, each packet including a header having a source port 
and a destination port, and a payload, the method
comprising:

storing an application-port map containing a plurality of 
associations between application identifiers and ports, 
each application identifier identifying a network
application;

determining from the application-port map a first appli- 
cation identifier associated with a port of a received 
packet, the first application identifier identifying a first 
network application; 

processing the received packet to identify a second
network application for handling the received packet by
applying an identification logic to the payload of the
received packet to detect a data pattern within the
received packet indicative of the second network
application;

responsive to successfully identifying a second network 
application by the detection of the data pattern within 
the received packet: 

setting an application identifier of the second network
application as a second application identifier;

creating a new association in the application-port map
associating the second application with the port of the
packet; and

passing the received packet to the second network 
application; 

and responsive to unsuccessfully identifying a second
network application, passing the packet to the first
network application.

2. The method of claim 1, further comprising:

establishing an expiration time for the new association in
the application-port map; and

periodically removing from the application-port map
associations that have expired according to their
expiration times.

3. The method of claim 1, further comprising:

detecting a packet indicating an end of a sequence of
packets for a selected network application; and

removing from the application-port map an association
including an application identifier for the selected
network application.

4. The method of claim 1, wherein the processing of the
received packet to identify the second application identifier
for handling the received packet, further comprises:

applying selected data from the payload to a plurality of
patterns, each pattern including a sequence of data
uniquely indicative of a network application, until the
selected data matches a pattern; and setting an
application identifier of the second network application
whose pattern matches the selected data as the second
application identifier.

5. The method of claim 1, wherein determining from the
application-port map the first application identifier
associated with the port of the received packet, further
comprises:

determining a timestamp difference between a timestamp
of the received packet and a timestamp of a previous
packet; and

responsive to the timestamp difference exceeding a
predetermined threshold, removing from the
application-port map associations having expiration times
earlier than a current time.

6. A computer implemented method for dynamically
determining a network application for a stream of network
packets, each packet including a header having a source port
and a destination port, and a payload, the method
comprising:

storing an application-port map containing a plurality of
associations between application identifiers and ports,
each application identifier identifying a network
application;

determining whether the application-port map includes an
association between an application identifier and a port
of a received packet;

determining a timestamp difference between a timestamp
of the received packet and a timestamp of a previous
packet;

responsive to the timestamp difference exceeding a
predetermined threshold, removing from the
application-port map associations having expiration
times earlier than a current time; and

responsive to the application-port map including an
application identifier associated with the port of the
received packet, updating an expiration time of association
including the application identifier as a function of a
timestamp of the received packet;

responsive to the application-port map including the
application identifier associated with the port of the
received packet, passing the received packet to the first
network application, and receiving a next packet; and

responsive to the application-port map not including the
application identifier associated with the port of the
packet:

processing the received packet to identify a second
network application for handling the received packet
by detecting a data pattern within the received packet
indicative of the second network application;

creating a new association in the application-port map 
associating a second application identifier identify- 
ing the second network application with the port of 
the received packet; and 

passing the received packet to the second network 
application. 

7. A system for dynamically determining a network
application for a stream of network packets, each packet
including a header having a source port and a destination
port, and a payload, the system comprising:

an application-port mapping table comprising a plurality 
of first application-port mapping objects, each object 
storing an association between an application identifier 
and a port, each object returning a respective
application identifier in response to a port, and each
application identifier identifying a network application;

a packet analysis module that receives a packet from a 
network source and applies a port from the received 
packet to the application-port mapping table to obtain
a first application identifier, the first application
identifier indicating a first network application for
processing the packet; and

an application identifier module that receives the received
packet from the packet analysis module and determines
from the received packet a second application identifier
indicating a second network application by detecting a
data pattern within the received packet indicative of the
second network application, the application identifier
module comprising a plurality of application
identification objects, each application identification
object including an identify method that encodes logic for
identifying a selected network application from payload
or header data of the received packet, and returning an
application identifier associated with the selected
application,

wherein the packet analysis module queries the
application-port mapping objects to obtain the first
application identifier, and queries the application
identifier objects to obtain the second application
identifier, and responsive to the second application
identifier indicating the second network application,
creates a new association in the application-port mapping
table between the second application identifier for the
second network application and the port of the received
packet, and provides the received packet to the second
network application, and responsive to the second
application identifier indicating an unknown application,
provides the packet to the first network application.

8. The system of claim 7, wherein the packet analysis
module establishes an expiration time for the new
association in the application-port mapping table, and
periodically removes from the application-port mapping table
associations which have expired according to their
expiration times.

9. The system of claim 7, wherein: 

the packet analysis module detects a packet indicating an 
end of a sequence of packets for a selected network 
application, and removes from the application-port 
mapping table an association including an application 
identifier for the selected network application. 

10. A system for dynamically determining a network 
application for a stream of network packets, each packet 
including a header having a source port and a destination 
port, and a payload, the system comprising: 

an application-port mapping table containing a plurality
of associations between application identifiers and
ports, each application identifier identifying a
respective network application;

a packet analysis module that receives a packet from a
network source and applies a port from the received
packet to the application-port mapping table to obtain
a first application identifier, the first application
identifier indicating a first network application for
processing the received packet, and responsive to the
application-port mapping table including an association
between a first application identifier and the port of the
received packet, provides the received packet to the
first network application, and responsive to the
application-port mapping table not including an
association between an application identifier and the port
of the received packet, provides the received packet to an
application identifier module; and
the application identifier module that receives the 
received packet from the packet analysis module and 
determines from the received packet a second applica- 
tion identifier identifying a second network application 
for processing the received packet by applying an 
identification logic to the payload of the received 
packet to detect a data pattern within the received 
packet indicative of the second network application, 
wherein the packet analysis module, responsive to the
second application identifier indicating a second network
application, sets an application identifier of the second
network application as the second application identifier,
creates a new association in the application-port mapping
table between the second application identifier and the port
of the received packet, and provides the received packet to
the second network application.

11. A computer implemented method for dynamically 
determining a network application for a stream of network
packets, each packet including a header having a source port 
and a destination port, and a payload, the method compris- 
ing: 

storing a plurality of associations between network
applications and ports;

responsive to a stored association between a first network
application and a port of a received packet, providing
the received packet to the first network application for
processing; and

responsive to an absence of a stored association between
a first network application and the port of the received
packet;

analyzing the received packet to identify a second
network application for handling the received packet by
applying an identification logic to the payload of the
received packet to detect a data pattern within the
received packet indicative of the second network
application;

setting an application identifier of the second network 
application as a second application identifier, and 

storing a new association associating the second applica- 
tion identifier with the port of the received packet. 
